Pathum Thani
Counterfactual Basis Extension and Representational Geometry: An MDL-Constrained Model of Conceptual Growth
Concept learning becomes possible only when existing representations fail to account for experience. Most models of learning and inference, however, presuppose a fixed representational basis within which belief updating occurs. In this paper, I address a prior question: under what structural conditions can the representational basis itself expand in a principled and selective way? I propose a geometric framework in which conceptual growth is modeled as admissible basis extension evaluated under a Minimum Description Length (MDL) criterion. Experience, whether externally observed or internally simulated, is represented as vectors relative to a current conceptual subspace. Residual components capture systematic representational failure, and candidate conceptual extensions are restricted to low-rank, admissible transformations. I show that any MDL-accepted extension can be chosen so that its novel directions lie entirely within the residual span induced by experience, while extensions orthogonal to this span strictly increase description length and are therefore rejected. This yields a conservative account of imagination and conceptual innovation. Internally generated counterfactual representations contribute to learning only insofar as they expose or amplify structured residual error, and cannot introduce arbitrary novelty. I further distinguish representational counterfactuals--counterfactuals over an agent's conceptual basis--from causal or value-level counterfactuals, and show how MDL provides a normative selection principle governing representational change. Overall, the framework characterizes conceptual development as an error-driven, geometry-constrained process of basis extension, clarifying both the role and the limits of imagination in learning and theory change.
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Thailand > Pathum Thani > Pathum Thani (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.35)
Developing a Thailand solar irradiance map using Himawari-8 satellite imageries and deep learning models
Suwanwimolkul, Suwichaya, Tongamrak, Natanon, Thungka, Nuttamon, Hoonchareon, Naebboon, Songsiri, Jitkomut
Thailand has targeted to achieve carbon neutrality by 2050 when the power grid will need to accommodate 50% share of renewable electricity generation capacity; see [Ene21]. The most recent draft of Power Development Plan 2024 (PDP2024) for 2024 - 2037 from [Ene24] proposes to add a new solar generation capacity of approximately 24,400 MWp (more than 4 times the amount issued in the previous Alternative Energy Development Plan 2015-2036 (AEDP2015) at 6,000 MWp, shown in [Dep15, p.9]. This amount does not yet include behind-the-meter, self-generation solar installed capacities of the prosumers, which is expected to increase at an accelerating rate. Solar integration into the power grid with such a sharprising amount will pose technical challenges to the operation and control of the transmission and distribution networks, carried out by the transmission system operator (TSO) and distribution system operator (DSO), as presented in [OB16]. Hence, TSO in Thailand will need an effective means to estimate the solar power generation across the entire transmission network, on an hourly basis, or even finer time resolution, to provide economic hour-to-hour generation dispatch for load following the total net load of the transmission, and to prepare sufficient system flexibility (i.e., ramp-rate capability of the thermal and hydropower plants, or energy storage systems) to cope with the net load fluctuation due to solar generation intermittency for maintaining system frequency stability, concurrently, in its operation. For DSO, a significant amount of reverse power flow when self-generation from solar exceeds self-consumption can lead to technical concerns of voltage regulation and equipment overloading problems. The near real-time estimation of solar generation in each distribution area will enable DSO to activate proper network switching or reconfiguring to mitigate such fundamental concerns to ensure its reliable operation.
- North America > United States (0.67)
- Oceania > Australia (0.28)
- Asia > Middle East > UAE (0.14)
- (42 more...)
- Energy > Renewable > Solar (1.00)
- Energy > Power Industry (1.00)
- Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.50)
- Government > Regional Government > North America Government > United States Government (0.46)
Personalised 3D Human Digital Twin with Soft-Body Feet for Walking Simulation
Loke, Kum Yew, Chan, Sherwin Stephen, Lei, Mingyuan, Johan, Henry, Zuo, Bingran, Ang, Wei Tech
With the increasing use of assistive robots in rehabilitation and assisted mobility of human patients, there has been a need for a deeper understanding of human-robot interactions particularly through simulations, allowing an understanding of these interactions in a digital environment. There is an emphasis on accurately modelling personalised 3D human digital twins in these simulations, to glean more insights on human-robot interactions. In this paper, we propose to integrate personalised soft-body feet, generated using the motion capture data of real human subjects, into a skeletal model and train it with a walking control policy. Through evaluation using ground reaction force and joint angle results, the soft-body feet were able to generate ground reaction force results comparable to real measured data and closely follow joint angle results of the bare skeletal model and the reference motion. This presents an interesting avenue to produce a dynamically accurate human model in simulation driven by their own control policy while only seeing kinematic information during training.
- Asia > Singapore (0.05)
- Asia > Thailand > Pathum Thani > Pathum Thani (0.04)
- Health & Medicine > Pharmaceuticals & Biotechnology (0.46)
- Health & Medicine > Therapeutic Area > Neurology (0.46)
- Health & Medicine > Therapeutic Area > Musculoskeletal (0.46)
V-RoAst: A New Dataset for Visual Road Assessment
Jongwiriyanurak, Natchapon, Zeng, Zichao, Goo, June Moh, Wang, Xinglei, Ilyankou, Ilya, Srirrongvikrai, Kerkritt, Wang, Meihui, Haworth, James
Road traffic crashes cause millions of deaths annually and have a significant economic impact, particularly in low- and middle-income countries (LMICs). This paper presents an approach using Vision Language Models (VLMs) for road safety assessment, overcoming the limitations of traditional Convolutional Neural Networks (CNNs). We introduce a new task ,V-RoAst (Visual question answering for Road Assessment), with a real-world dataset. Our approach optimizes prompt engineering and evaluates advanced VLMs, including Gemini-1.5-flash and GPT-4o-mini. The models effectively examine attributes for road assessment. Using crowdsourced imagery from Mapillary, our scalable solution influentially estimates road safety levels. In addition, this approach is designed for local stakeholders who lack resources, as it does not require training data. It offers a cost-effective and automated methods for global road safety assessments, potentially saving lives and reducing economic burdens.
- Oceania > New Zealand > South Island > Otago > Dunedin (0.04)
- Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
- Europe > Greece (0.04)
- (3 more...)
How False Data Affects Machine Learning Models in Electrochemistry?
Deshsorna, Krittapong, Lawtrakul, Luckhana, Iamprasertkun, Pawin
Recently, the selection of machine learning model based on only the data distribution without concerning the noise of the data. This study aims to distinguish, which models perform well under noisy data, and establish whether stacking machine learning models actually provide robustness to otherwise weak-to-noise models. The electrochemical data were tested with 12 standalone models and stacking model. This includes XGB, LGBM, RF, GB, ADA, NN, ELAS, LASS, RIDGE, SVM, KNN, DT, and the stacking model. It is found that linear models handle noise well with the average error of (slope) to 1.75 F g-1 up to error per 100% percent noise added; but it suffers from prediction accuracy due to having an average of 60.19 F g-1 estimated at minimal error at 0% noise added. Tree-based models fail in terms of noise handling (average slope is 55.24 F g-1 at 100% percent noise), but it can provide higher prediction accuracy (lowest error of 23.9 F g-1) than that of linear. To address the controversial between prediction accuracy and error handling, the stacking model was constructed, which is not only show high accuracy (intercept of 25.03 F g-1), but it also exhibits good noise handling (slope of 43.58 F g-1), making stacking models a relatively low risk and viable choice for beginner and experienced machine learning research in electrochemistry. Even though neural networks (NN) are gaining popularity in the electrochemistry field. However, this study presents that NN is not suitable for electrochemical data, and improper tuning resulting in a model that is susceptible to noise. Thus, STACK models should provide better benefits in that even with untuned base models, they can achieve an accurate and noise-tolerant model. Overall, this work provides insight into machine learning model selection for electrochemical data, which should aid the understanding of data science in chemistry context.
- Energy (0.68)
- Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.46)
StitchNet: Composing Neural Networks from Pre-Trained Fragments
Teerapittayanon, Surat, Comiter, Marcus, McDanel, Brad, Kung, H. T.
We propose StitchNet, a novel neural network creation paradigm that stitches together fragments (one or more consecutive network layers) from multiple pre-trained neural networks. StitchNet allows the creation of high-performing neural networks without the large compute and data requirements needed under traditional model creation processes via backpropagation training. We leverage Centered Kernel Alignment (CKA) as a compatibility measure to efficiently guide the selection of these fragments in composing a network for a given task tailored to specific accuracy needs and computing resource constraints. We then show that these fragments can be stitched together to create neural networks with accuracy comparable to that of traditionally trained networks at a fraction of computing resource and data requirements. Finally, we explore a novel on-the-fly personalized model creation and inference application enabled by this new paradigm. The code is available at https://github.com/steerapi/stitchnet.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Pennsylvania > Lancaster County > Lancaster (0.04)
- Asia > Thailand > Pathum Thani > Pathum Thani (0.04)
Framework for inferring empirical causal graphs from binary data to support multidimensional poverty analysis
Amornbunchornvej, Chainarong, Surasvadi, Navaporn, Plangprasopchok, Anon, Thajchayapong, Suttipong
Poverty is one of the fundamental issues that mankind faces. To solve poverty issues, one needs to know how severe the issue is. The Multidimensional Poverty Index (MPI) is a well-known approach that is used to measure a degree of poverty issues in a given area. To compute MPI, it requires information of MPI indicators, which are \textbf{binary variables} collecting by surveys, that represent different aspects of poverty such as lacking of education, health, living conditions, etc. Inferring impacts of MPI indicators on MPI index can be solved by using traditional regression methods. However, it is not obvious that whether solving one MPI indicator might resolve or cause more issues in other MPI indicators and there is no framework dedicating to infer empirical causal relations among MPI indicators. In this work, we propose a framework to infer causal relations on binary variables in poverty surveys. Our approach performed better than baseline methods in simulated datasets that we know ground truth as well as correctly found a causal relation in the Twin births dataset. In Thailand poverty survey dataset, the framework found a causal relation between smoking and alcohol drinking issues. We provide R CRAN package `BiCausality' that can be used in any binary variables beyond the poverty analysis context.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > California > Orange County > Irvine (0.14)
- Europe > Austria > Vienna (0.14)
- (13 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Health & Medicine > Therapeutic Area (1.00)
- Government (0.93)
- Education (0.67)
Improving Data Transfer Efficiency for AIs in the DareFightingICE using gRPC
Nimpattanavong, Chollakorn, Khan, Ibrahim, Van Nguyen, Thai, Thawonmas, Ruck, Choensawat, Worawat, Sookhanaphibarn, Kingkarn
This paper presents a new communication interface for the DareFightingICE platform, a Java-based fighting game focused on implementing AI for controlling a non-player character. The interface uses an open-source remote procedure call, gRPC to improve the efficiency of data transfer between the game and the AI, reducing the time spent on receiving information from the game server. This is important because the main challenge of implementing AI in a fighting game is the need for the AI to select an action to perform within a short response time. The DareFightingICE platform has been integrated with Py4J, allowing developers to create AIs using Python. However, Py4J is less efficient at handling large amounts of data, resulting in excessive latency. In contrast, gRPC is well-suited for transmitting large amounts of data. To evaluate the effectiveness of the new communication interface, we conducted an experiment comparing the latency of gRPC and Py4J, using a rule-based AI that sends a kick command regardless of the information received from the game server. The experiment results showed not only a 65\% reduction in latency but also improved stability and eliminated missed frames compared to the current interface.
Variable-lag Granger Causality for Time Series Analysis
Amornbunchornvej, Chainarong, Zheleva, Elena, Berger-Wolf, Tanya Y.
Granger causality is a fundamental technique for causal inference in time series data, commonly used in the social and biological sciences. Typical operationalizations of Granger causality make a strong assumption that every time point of the effect time series is influenced by a combination of other time series with a fixed time delay. However, the assumption of the fixed time delay does not hold in many applications, such as collective behavior, financial markets, and many natural phenomena. To address this issue, we develop variable-lag Granger causality, a generalization of Granger causality that relaxes the assumption of the fixed time delay and allows causes to influence effects with arbitrary time delays. In addition, we propose a method for inferring variable-lag Granger causality relations. We demonstrate our approach on an application for studying coordinated collective behavior and show that it performs better than several existing methods in both simulated and real-world datasets. Our approach can be applied in any domain of time series analysis.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
A nonparametric framework for inferring orders of categorical data from category-real ordered pairs
Amornbunchornvej, Chainarong, Surasvadi, Navaporn, Plangprasopchok, Anon, Thajchayapong, Suttipong
Given a dataset of careers and incomes, how large a difference of income between any pair of careers would be? Given a dataset of travel time records, how long do we need to spend more when choosing a public transportation mode $A$ instead of $B$ to travel? In this paper, we propose a framework that is able to infer orders of categories as well as magnitudes of difference of real numbers between each pair of categories using Estimation statistics framework. Not only reporting whether an order of categories exists, but our framework also reports the magnitude of difference of each consecutive pairs of categories in the order. In large dataset, our framework is scalable well compared with the existing framework. The proposed framework has been applied to two real-world case studies: 1) ordering careers by incomes based on information of 350,000 households living in Khon Kaen province, Thailand, and 2) ordering sectors by closing prices based on 1060 companies' closing prices of NASDAQ stock markets between years 2000 and 2016. The results of careers ordering show income inequality among different careers. The stock market results illustrate dynamics of sector domination that can change over time. Our approach is able to be applied in any research area that has category-real ordered pairs. Our proposed "Dominant-Distribution Network" provides a novel approach to gain new insight of analyzing category orders. The software of this framework is available for researchers or practitioners within R package: EDOIF.
- Asia > Thailand > Khon Kaen > Khon Kaen (0.25)
- North America > United States > California (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Communications (0.68)